Distributed web crawling

Results: 21



11. Software / Distributed web crawling / Focused crawler / Information science / Web crawlers / World Wide Web

September 26, 2001 SRC Research Report

Source URL: www.cs.cornell.edu

Language: English - Date: 2002-09-01 17:56:10
12. Distributed data storage / Information science / Cross-platform software / GPL / YaCy / Apache Solr / Peer-to-peer / Distributed hash table / Distributed web crawling / Software / Computing / Internet search engines

SearchEngine A Web Search Appliance with Solr and YaCy Michael Christen, [removed] ApacheCon 2012, [removed]

Source URL: archive.apachecon.com

Language: English - Date: 2012-11-25 15:54:02
13. Searching / Web search engine / Search engine indexing / World Wide Web / Distributed web crawling / Focused crawler / Information science / Information retrieval / Web crawlers

Parallel Crawlers Junghoo Cho Hector Garcia-Molina University of California, Los Angeles

Source URL: ilpubs.stanford.edu

Language: English - Date: 2008-09-16 22:41:01
14. World Wide Web / Internet / Information retrieval / Robots exclusion standard / Web search engine / United States House of Representatives Page / Distributed web crawling / Information science / Web crawlers / Computing

An Adaptive Model for Optimizing Performance of an Incremental Web Crawler Jenny Edwards 

Source URL: www10.org

Language: English - Date: 2001-03-23 05:17:32
15. Web crawlers / Spamming / Uniform resource locator / Robots exclusion standard / Spamdexing / PageRank / Anti-spam techniques / Internet Archive / Distributed web crawling / World Wide Web / Information science / Computing

IRLbot: Scaling to 6 Billion Pages and Beyond Hsin-Tsang Lee, Derek Leonard, Xiaoming Wang, and Dmitri Loguinov Department of Computer Science, Texas A&M University

Source URL: irl.cs.tamu.edu

Language: English - Date: 2008-02-25 21:31:50
16. Internet search engines / Cross-platform software / Searching / Nutch / Lucene / Web crawler / Search engine indexing / Heritrix / Distributed web crawling / Information science / Software / Information retrieval

Full Text Search of Web Archive Collections MICHAEL STACK Internet Archive The Presidio of San Francisco 116 Sheridan Ave. San Francisco, CA 94129

Source URL: archive-access.sourceforge.net

Language: English - Date: 2007-02-09 20:09:46
17. World Wide Web / Heritrix / Focused crawler / Web harvesting / Web archiving / Robots exclusion standard / Web search engine / Distributed web crawling / Information science / Web crawlers / Information retrieval

PDF Document

Source URL: www.ipsyp.gr

Language: English - Date: 2013-09-23 08:37:31
18. Web crawler / Invisible Web / Database / Precision and recall / Distributed web crawling / Search engine indexing / Web query classification / Document classification / Information science / Information retrieval / Information

Query- vs. Crawling-based Classification of Searchable Web Databases Luis Gravano

Source URL: qprober.cs.columbia.edu

Language: English - Date: 2013-03-14 11:23:16
19. Computing / Information retrieval / Focused crawler / Invisible Web / Robots exclusion standard / Web search engine / Internet Archive / Distributed web crawling / Web harvesting / Information science / World Wide Web / Web crawlers

Design and Implementation of a High-Performance Distributed Web Crawler Vladislav Shkapenyuk

Source URL: www.cis.poly.edu

Language: English - Date: 2001-08-13 18:57:45
20. Computing / Information retrieval / Focused crawler / Invisible Web / Robots exclusion standard / Web search engine / Internet Archive / Distributed web crawling / Web harvesting / Information science / World Wide Web / Web crawlers

PDF Document

Source URL: cis.poly.edu

Language: English - Date: 2001-08-13 18:57:45